nlp_architect.data.cdc_resources.relations.wikipedia_relation_extraction.WikipediaRelationExtraction

class nlp_architect.data.cdc_resources.relations.wikipedia_relation_extraction.WikipediaRelationExtraction(method: nlp_architect.data.cdc_resources.relations.relation_types_enums.WikipediaSearchMethod = <WikipediaSearchMethod.ONLINE: 'online'>, wiki_file: str = None, host: str = None, port: int = None, index: str = None)[source]
__init__(method: nlp_architect.data.cdc_resources.relations.relation_types_enums.WikipediaSearchMethod = <WikipediaSearchMethod.ONLINE: 'online'>, wiki_file: str = None, host: str = None, port: int = None, index: str = None) → None[source]

Extract relations between two mentions based on Wikipedia knowledge

Parameters:
  • method (optional) – WikipediaSearchMethod.{ONLINE/OFFLINE/ELASTIC}: run against the Wikipedia site, a subset of Wikipedia, or a local Elasticsearch database (default = ONLINE)
  • wiki_file (required in OFFLINE mode) – str, location of the Wikipedia file to work with
  • host (required in ELASTIC mode) – str, the Elasticsearch host name
  • port (required in ELASTIC mode) – int, the Elasticsearch port number
  • index (required in ELASTIC mode) – str, the Elasticsearch index name
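
The per-mode requirements above can be checked before constructing the extractor. The following is a minimal sketch, not part of the library: `SearchMethod` is a hypothetical stand-in for `WikipediaSearchMethod` (only the value `'online'` appears in the signature above; `'offline'` and `'elastic'` are assumed), and `build_extractor_kwargs` is an illustrative helper name.

```python
from enum import Enum
from typing import Any, Dict, Optional


class SearchMethod(Enum):
    """Hypothetical stand-in for WikipediaSearchMethod."""
    ONLINE = "online"
    OFFLINE = "offline"  # assumed value, not confirmed by the reference above
    ELASTIC = "elastic"  # assumed value, not confirmed by the reference above


def build_extractor_kwargs(
    method: SearchMethod = SearchMethod.ONLINE,
    wiki_file: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
    index: Optional[str] = None,
) -> Dict[str, Any]:
    """Verify that the arguments required by the chosen mode are present."""
    if method is SearchMethod.OFFLINE and wiki_file is None:
        raise ValueError("OFFLINE mode requires wiki_file")
    if method is SearchMethod.ELASTIC:
        # Collect every missing Elastic argument so the error names them all.
        missing = [
            name
            for name, value in (("host", host), ("port", port), ("index", index))
            if value is None
        ]
        if missing:
            raise ValueError("ELASTIC mode requires: " + ", ".join(missing))
    return {
        "method": method,
        "wiki_file": wiki_file,
        "host": host,
        "port": port,
        "index": index,
    }
```

With the real package installed, the same keyword names would presumably be passed to `WikipediaRelationExtraction(...)`, with `method` taken from `WikipediaSearchMethod` instead of the stand-in enum.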

Methods

__init__(method, wiki_file, host, port, index) Extract relations between two mentions based on Wikipedia knowledge
extract_aliases(pages1, pages2, titles1, titles2) Check whether the input mentions have an alias relation
extract_all_relations(mention_x, mention_y) Try to find whether the mentions have one or more of the relations this class supports
extract_be_comp(pages1, pages2, titles1, titles2) Check whether the input mentions have a be-comp/is-a relation
extract_category(pages1, pages2, titles1, …) Check whether the input mentions have a category relation
extract_disambig(pages1, pages2, titles1, …) Check whether the input mentions have a disambiguation relation
extract_parenthesis(pages1, pages2, titles1, …) Check whether the input mentions have a parenthesis relation
extract_relation(mention_x, mention_y, relation) Base-class method: check whether the subclass supports the given relation before running the subclass extraction
extract_sub_relations(mention_x, mention_y, …) Check whether the input mentions have the given relation between them
get_phrase_related_pages(mention_str) Get all WikipediaPages related to this mention string
get_supported_relations() Return all relations supported by this class
is_both_opposite_personal_pronouns(phrase1, …) Check whether one phrase refers to a female while the other refers to a male, or vice versa
is_part_of_same_name(pages1, pages2) Check whether the input mentions are parts of the same name (e.g., page1=John, page2=Smith)
is_redirect_same(pages1, pages2) Check whether the input mentions share the same Wikipedia redirect page
static extract_aliases(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have an alias relation

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
  • titles1 – Set[str]
  • titles2 – Set[str]
Returns:

RelationType.WIKIPEDIA_ALIASES or RelationType.NO_RELATION_FOUND

extract_all_relations(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight) → Set[nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType][source]

Try to find whether the mentions have one or more of the relations this class supports

Parameters:
  • mention_x – MentionDataLight
  • mention_y – MentionDataLight
Returns:

One or more of: RelationType.WIKIPEDIA_BE_COMP, RelationType.WIKIPEDIA_TITLE_PARENTHESIS, RelationType.WIKIPEDIA_DISAMBIGUATION, RelationType.WIKIPEDIA_CATEGORY, RelationType.WIKIPEDIA_REDIRECT_LINK, RelationType.WIKIPEDIA_ALIASES, RelationType.WIKIPEDIA_PART_OF_SAME_NAME

Return type:

Set[RelationType]
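
Because the result is a set, callers typically need to decide whether any real relation was found. The sketch below uses a hypothetical stand-in enum that copies only the member names listed above; whether the real method returns an empty set or a set containing `NO_RELATION_FOUND` when nothing matches is not stated here, so the helper `found_relations` (an illustrative name) handles both cases.

```python
from enum import Enum, auto
from typing import Set


class RelationType(Enum):
    """Hypothetical stand-in mirroring the member names listed above."""
    NO_RELATION_FOUND = auto()
    WIKIPEDIA_BE_COMP = auto()
    WIKIPEDIA_TITLE_PARENTHESIS = auto()
    WIKIPEDIA_DISAMBIGUATION = auto()
    WIKIPEDIA_CATEGORY = auto()
    WIKIPEDIA_REDIRECT_LINK = auto()
    WIKIPEDIA_ALIASES = auto()
    WIKIPEDIA_PART_OF_SAME_NAME = auto()


def found_relations(relations: Set[RelationType]) -> Set[RelationType]:
    """Drop the NO_RELATION_FOUND marker, leaving only real relations."""
    return {r for r in relations if r is not RelationType.NO_RELATION_FOUND}


def are_related(relations: Set[RelationType]) -> bool:
    """True when at least one real relation is present in the result set."""
    return bool(found_relations(relations))
```

In real use, `relations` would be the value returned by `extract_all_relations(mention_x, mention_y)`.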

static extract_be_comp(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a be-comp/is-a relation

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
  • titles1 – Set[str]
  • titles2 – Set[str]
Returns:

RelationType.WIKIPEDIA_BE_COMP or RelationType.NO_RELATION_FOUND

static extract_category(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a category relation

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
  • titles1 – Set[str]
  • titles2 – Set[str]
Returns:

RelationType.WIKIPEDIA_CATEGORY or RelationType.NO_RELATION_FOUND

static extract_disambig(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a disambiguation relation

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
  • titles1 – Set[str]
  • titles2 – Set[str]
Returns:

RelationType.WIKIPEDIA_DISAMBIGUATION or RelationType.NO_RELATION_FOUND

static extract_parenthesis(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a parenthesis relation

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
  • titles1 – Set[str]
  • titles2 – Set[str]
Returns:

RelationType.WIKIPEDIA_TITLE_PARENTHESIS or RelationType.NO_RELATION_FOUND

extract_relation(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight, relation: nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType

Base-class method: check whether the subclass supports the given relation before running the subclass extraction

Parameters:
  • mention_x – MentionDataLight
  • mention_y – MentionDataLight
  • relation – RelationType
Returns:

The given relation if the mentions have it, RelationType.NO_RELATION_FOUND otherwise

Return type:

RelationType
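
The base-class check described above can be pictured as a small template-method pattern. This is a sketch under stated assumptions, not the library's code: `RelationExtractorSketch` and `AliasOnlyExtractor` are hypothetical classes, `RelationType` is a reduced stand-in enum, and mentions are plain strings rather than `MentionDataLight` objects.

```python
from enum import Enum, auto
from typing import List


class RelationType(Enum):
    """Reduced hypothetical stand-in for the library's RelationType."""
    NO_RELATION_FOUND = auto()
    WIKIPEDIA_ALIASES = auto()
    WIKIPEDIA_CATEGORY = auto()


class RelationExtractorSketch:
    """Hypothetical base class: guards extraction with a support check."""

    def get_supported_relations(self) -> List[RelationType]:
        raise NotImplementedError

    def extract_sub_relations(
        self, mention_x: str, mention_y: str, relation: RelationType
    ) -> RelationType:
        raise NotImplementedError

    def extract_relation(
        self, mention_x: str, mention_y: str, relation: RelationType
    ) -> RelationType:
        # Unsupported relations short-circuit before the subclass runs.
        if relation not in self.get_supported_relations():
            return RelationType.NO_RELATION_FOUND
        return self.extract_sub_relations(mention_x, mention_y, relation)


class AliasOnlyExtractor(RelationExtractorSketch):
    """Toy subclass: treats case-insensitive equality as an alias."""

    def get_supported_relations(self) -> List[RelationType]:
        return [RelationType.WIKIPEDIA_ALIASES]

    def extract_sub_relations(
        self, mention_x: str, mention_y: str, relation: RelationType
    ) -> RelationType:
        if mention_x.lower() == mention_y.lower():
            return relation
        return RelationType.NO_RELATION_FOUND
```

The toy equality test only exists to make the sketch executable; the real subclass consults Wikipedia pages instead.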

extract_sub_relations(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight, relation: nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have the given relation between them

Parameters:
  • mention_x – MentionDataLight
  • mention_y – MentionDataLight
  • relation – RelationType
Returns:

The given relation if the mentions have it, RelationType.NO_RELATION_FOUND otherwise

Return type:

RelationType

get_phrase_related_pages(mention_str: str) → WikipediaPages

Get all WikipediaPages related to this mention string

Parameters: mention_str – str
Returns: WikipediaPages
static get_supported_relations() → List[nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType][source]

Return all relations supported by this class

Returns: List[RelationType]
static is_both_opposite_personal_pronouns(phrase1: str, phrase2: str) → bool[source]

Check whether one phrase refers to a female while the other refers to a male, or vice versa

Returns:bool
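
A check of this kind can be sketched with two pronoun sets. The sets and the function name `opposite_pronouns_sketch` below are assumptions chosen for illustration; the pronoun lists the library actually uses are not shown on this page.

```python
# Assumed pronoun lists; the library's own lists may differ.
FEMALE_PRONOUNS = frozenset({"she", "her", "hers", "herself"})
MALE_PRONOUNS = frozenset({"he", "him", "his", "himself"})


def opposite_pronouns_sketch(phrase1: str, phrase2: str) -> bool:
    """Return True when one phrase is a female pronoun and the other a male one."""
    p1 = phrase1.strip().lower()
    p2 = phrase2.strip().lower()
    female_then_male = p1 in FEMALE_PRONOUNS and p2 in MALE_PRONOUNS
    male_then_female = p1 in MALE_PRONOUNS and p2 in FEMALE_PRONOUNS
    return female_then_male or male_then_female
```

A check like this is useful as an early exit: two mentions that are pronouns of opposite gender cannot corefer, so no Wikipedia lookup is needed for them.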
is_part_of_same_name(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages) → bool[source]

Check whether the input mentions are parts of the same name (e.g., page1=John, page2=Smith)

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
Returns:

bool

static is_redirect_same(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages) → bool[source]

Check whether the input mentions share the same Wikipedia redirect page

Parameters:
  • pages1 – WikipediaPages
  • pages2 – WikipediaPages
Returns:

bool